Estimating Word Alignment Quality for SMT Reordering Tasks
Abstract
Previous studies of the effect of word alignment on translation quality in SMT generally explore link-level metrics only and mostly do not show any clear connection between alignment and SMT quality. In this paper, we specifically investigate the impact of word alignment on two pre-reordering tasks in translation, using a wider range of quality indicators than in previous work. Experiments on German–English translation show that reordering may require alignment models different from those used by the core translation system. Sparse alignments with high precision on the link level, for translation units, and on the subset of crossing links, such as those from intersected HMM models, are preferred. Unlike for SMT performance, the desired alignment characteristics for the pre-reordering tasks are similar for small and large training data. Moreover, we confirm previous research showing that the fuzzy reordering score is a useful and cheap proxy for performance on SMT reordering tasks.
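The quality indicators named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption: links are taken to be (source index, target index) pairs, two links cross when their source and target orders disagree, and the fuzzy reordering score uses a common chunk-based formulation, 1 - (C - 1) / (M - 1), where C is the number of contiguous chunks needed to rebuild the reference order and M is the number of words. All function names are hypothetical.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]  # assumed representation: (source index, target index)


def link_precision_recall(predicted: Set[Link], gold: Set[Link]) -> Tuple[float, float]:
    """Link-level precision and recall of predicted links against gold links."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def crossing_links(links: Set[Link]) -> Set[Link]:
    """Return the subset of links that cross at least one other link."""
    crossing: Set[Link] = set()
    ordered = sorted(links)  # sorted by source index, then target index
    for i, (s1, t1) in enumerate(ordered):
        for s2, t2 in ordered[i + 1:]:
            # Source order and target order disagree: the two links cross.
            if s1 < s2 and t1 > t2:
                crossing.add((s1, t1))
                crossing.add((s2, t2))
    return crossing


def fuzzy_reordering_score(hypothesis: List[int], reference: List[int]) -> float:
    """Chunk-based score in [0, 1]; 1.0 means the orders are identical.

    Both arguments are assumed to be permutations of the same unique indices.
    """
    if sorted(hypothesis) != sorted(reference):
        raise ValueError("hypothesis and reference must contain the same items")
    m = len(reference)
    if m <= 1:
        return 1.0
    position = {token: i for i, token in enumerate(reference)}
    chunks = 1
    for prev, cur in zip(hypothesis, hypothesis[1:]):
        # A new chunk starts whenever two neighbours in the hypothesis
        # are not adjacent, in the same order, in the reference.
        if position[cur] != position[prev] + 1:
            chunks += 1
    return 1.0 - (chunks - 1) / (m - 1)
```

For example, under these assumptions `fuzzy_reordering_score([2, 3, 0, 1], [0, 1, 2, 3])` splits the hypothesis into two chunks, while a fully reversed order of four words needs four chunks and scores 0.0. Restricting `link_precision_recall` to the output of `crossing_links` gives one possible reading of precision "on the subset of crossing links".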
Similar resources
Statistical Machine Reordering
Reordering is currently one of the most important problems in statistical machine translation systems. This paper presents a novel strategy for dealing with it: statistical machine reordering (SMR). It consists in using the powerful techniques developed for statistical machine translation (SMT) to translate the source language (S) into a reordered source language (S’), which allows for an impro...
Alignment-based reordering for SMT
We present a method for improving word alignment quality for phrase-based SMT by reordering the source text according to the target word order suggested by an initial word alignment. The reordered text is used to create a second word alignment which can be an improvement of the first alignment, since the word order is more similar. The method requires no other pre-processing such as part-of-spe...
Reordering Matrix Post-verbal Subjects for Arabic-to-English SMT
We improve our recently proposed technique for integrating Arabic verb-subject constructions in SMT word alignment (Carpuat et al., 2010) by distinguishing between matrix (or main clause) and non-matrix Arabic verb-subject constructions. In gold translations, most matrix VS (main clause verb-subject) constructions are translated in inverted SV order, while non-matrix (subordinate clause) VS con...
Sistema Estadístico de Reordenamiento de Palabras en Traducción Automática
Nowadays, reordering is one of the most important problems in Statistical Machine Translation (SMT) systems. This paper exposes a novel strategy to face it: Statistical Machine Reordering (SMR). It consists of using the powerful techniques developed for Statistical Machine Translation (SMT) in order to translate the source language (S) into a reordered source language (S’), which allows for an ...
Iterative reordering and word alignment for statistical MT
Word alignment is necessary for statistical machine translation (SMT), and reordering as a preprocessing step has been shown to improve SMT for many language pairs. In this initial study we investigate if both word alignment and reordering can be improved by iterating these two steps, since they both depend on each other. Overall no consistent improvements were seen on the translation task, but...